Major Accomplishments: The following key research activities have been accomplished as a
result of this project.

1. Log segmentation models using (a) dependencies among data items, (b) a fixed number of
transactions, (c) a fixed time window, and (d) a fixed file size have been developed. Each of
these methods organizes the database log file for faster access during damage assessment.

2. A formal approach to the data dependency method has been established. A concept called
the Directed Damage Demonstration Graph has been developed, which aids in faster and
more accurate damage assessment.

3. For faster recovery, a transaction fusion technique has been developed. This method
combines several transactions and executes them as one transaction during recovery.

4. A data reduction technique has been developed which uses simple and compact data
structures for faster damage assessment. Any of these structures can be used during
damage assessment instead of scanning the huge log file.

5. A data classification approach has been designed in order to manage transactions in the
database in a secure manner. An appropriate access control mechanism has been
developed.

6. Logging transactions’ activities using their semantics is necessary for accomplishing
complete and accurate recovery of a damaged system. Appropriate models have been
developed in this project.

Executive Summary

In order to expedite damage assessment, techniques have been developed for segmenting
the log into multiple smaller files. During recovery, only the segments containing damaged data
items are accessed, and those data items are recovered. The first method for log segmentation
uses data dependency relationships and stores related transaction operations in one segment. In
this effort, the main objective was to segregate the related data items accurately. An algorithm
that clusters the log by grouping related operations based on data dependency has been
developed. Specific measures have been taken so that only the affected data operations are
considered during recovery once the attacking transaction is identified. This makes it possible to
skip those parts of the log that are unaffected during damage assessment and recovery. An
algorithm for recovery using the clusters has also been developed. Damage assessment and
recovery with this model demonstrated a significant performance gain over the traditional log
approach. Various concepts have been presented that would enhance the damage assessment
process. During recovery, one of the major concerns is denial of service: making the database
available is as important as keeping it consistent. Concepts such as critical links and cliques help
in carrying out faster damage assessment and making the data items available at the earliest
opportunity. These concepts have been developed in a generic manner and hence can be used
with other recovery models in which damage assessment during recovery is time consuming.
These concepts must be applied to the database at design time. The granularity of the nodes
defined in these concepts may be changed according to the needs of the project.

One of the problems with data-dependency-based log segmentation is that it requires
considerable computation to determine the dependency relationships, thereby slowing down
transaction execution. Although the method significantly accelerates the damage assessment
process, in general the payoff is not substantial unless attacks are detected frequently. To
alleviate these problems, attention was given to segmenting the log based on criteria that use
little system time yet still serve the purpose of quick damage assessment. In the first model, the
log is segmented based on the number of committed transactions. A segment, called a tuft,
stores a fixed number of committed transactions in this case. In the second approach, the tuft is
built based on a fixed time window: all transactions that commit within a time window are added
to the same tuft. In the third approach, a constant tuft size is maintained, and the operations of
committed transactions that fit into the tuft are recorded. It was assumed that the tuft size is
larger than the largest transaction. In this method, transactions were not allowed to span multiple
segments; allowing this would have saved disk space, but disallowing it simplifies damage
assessment. Algorithms have been developed for damage assessment that can be used with a log
segmented by any of the three approaches. The damage assessment algorithm generates a list of
all the malicious and affected transactions. Using this information, any of the previously
proposed approaches can be used to carry out the recovery process. Simulation showed that
damage assessment with a segmented log is faster than the traditional approach, in which the log
is not segmented. Damage assessment on a log segmented by tuft size was the quickest of the
three because far fewer bytes had to be read from the log to find the affected transactions: it read
fewer bytes than damage assessment based on the number of committed transactions, and it did
not show the random behavior of damage assessment based on a time window. The tufts for all
three methods were built from the same log, so the damage assessment results could be
compared directly.
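
The three tuft-building criteria can be illustrated with a minimal Python sketch. It is not the
project's implementation: the Txn record, its commit_time and size fields, and the function
names are hypothetical, and the log is assumed to hold committed transactions in commit order.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Txn:
    """Hypothetical summary of one committed transaction in the log."""
    tid: str
    commit_time: float  # assumed commit timestamp, in seconds
    size: int           # assumed size of the transaction's log records, in bytes


def tufts_by_count(log: List[Txn], n: int) -> List[List[Txn]]:
    """Each tuft holds a fixed number n of committed transactions."""
    return [log[i:i + n] for i in range(0, len(log), n)]


def tufts_by_time(log: List[Txn], window: float) -> List[List[Txn]]:
    """Each tuft holds the transactions that commit within one time window."""
    if not log:
        return []
    start = log[0].commit_time
    buckets: Dict[int, List[Txn]] = {}
    for txn in log:
        index = int((txn.commit_time - start) // window)
        buckets.setdefault(index, []).append(txn)
    # Windows in which nothing committed produce no tuft.
    return [buckets[index] for index in sorted(buckets)]


def tufts_by_size(log: List[Txn], capacity: int) -> List[List[Txn]]:
    """Each tuft has a fixed byte capacity; a transaction never spans tufts."""
    tufts: List[List[Txn]] = []
    current: List[Txn] = []
    used = 0
    for txn in log:
        if txn.size > capacity:
            # The text assumes the tuft is larger than the largest transaction.
            raise ValueError(f"{txn.tid} does not fit into a single tuft")
        if used + txn.size > capacity:
            tufts.append(current)  # close the tuft; leftover space is wasted
            current, used = [], 0
        current.append(txn)
        used += txn.size
    if current:
        tufts.append(current)
    return tufts
```

In the size-based variant, the space left at the end of a closed tuft is the disk-space cost of
keeping every transaction within a single segment.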


The theory and concepts needed to make the data dependency approach more robust and
general have been developed. These include classifications of read and write operations, a
new definition of a transaction, and a new representation of the scheduler. The proposed
scheduler stores more pertinent information than the conventional scheduler. Based on this
theoretical support, an appropriate damage assessment and recovery algorithm has been
designed. This algorithm considers dependencies among data items accessed by various
transactions to precisely identify affected data items in a damaged database and restore them to
their consistent values.
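
As a rough illustration of damage assessment at the level of data items, the following Python
sketch propagates damage through a log of write records. It is a simplified stand-in, not the
algorithm designed in the project: the record layout (transaction id, item written, items read to
compute the written value) and the rule that a clean overwrite refreshes an item are assumptions
made for this example.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical log record: (transaction id, item written, items the value was computed from).
WriteRecord = Tuple[str, str, Set[str]]


def assess_damage(
    log: Iterable[WriteRecord], malicious: Set[str]
) -> Tuple[Set[str], List[Tuple[str, str]]]:
    """Scan write records in log order and propagate damage between data items.

    Returns the items still damaged at the end of the log and the list of
    (transaction id, item) writes that must be undone or re-executed.
    """
    damaged: Set[str] = set()
    affected_writes: List[Tuple[str, str]] = []
    for tid, item, sources in log:
        if tid in malicious or damaged & set(sources):
            # The written value is bad: it comes from the attacker or from damaged inputs.
            damaged.add(item)
            affected_writes.append((tid, item))
        else:
            # Assumption: a write computed only from clean inputs refreshes the item.
            damaged.discard(item)
    return damaged, affected_writes
```

Because damage is tracked per write rather than per transaction, a transaction that reads one
damaged item may still have writes that are unaffected and need not be repaired.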

Recovery is one of the main phases of defensive information warfare and must be carried
out in the shortest time possible to minimize denial of service. The recovery process involves
undoing malicious and affected transactions and redoing affected transactions. A model that
fuses each set of malicious or affected transactions occurring in groups into a single fused
transaction has been designed. The fused malicious and affected transactions are undone in the
undo process, and then the fused affected transactions are re-executed in the redo process.
Because the number of transactions and the total number of operations are reduced, executing
these fused transactions during recovery expedites the process. An algorithm based on the
concept of transaction fusion has been developed; it aims to recover a system affected by
malicious activity. A simulation model was constructed to evaluate the performance of the
transaction fusion model. The simulation results showed that the transaction fusion model
always performed better than the traditional approaches.
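
The saving from fusion can be illustrated for the undo phase with a short Python sketch. It
assumes, for illustration only, that each logged write carries a before-image and an after-image
and that the transactions being fused are adjacent in the log; the record layout and the function
names are hypothetical and are not taken from the project's algorithm.

```python
from typing import Any, Dict, Iterable, Tuple

# Hypothetical write record: (transaction id, item, before-image, after-image).
Write = Tuple[str, str, Any, Any]


def fuse(writes: Iterable[Write]) -> Dict[str, Tuple[Any, Any]]:
    """Collapse the writes of a group of adjacent transactions into one fused
    transaction: per item, keep the earliest before-image and the latest after-image."""
    fused: Dict[str, Tuple[Any, Any]] = {}
    for _tid, item, before, after in writes:
        if item in fused:
            first_before, _ = fused[item]
            fused[item] = (first_before, after)
        else:
            fused[item] = (before, after)
    return fused


def undo(db: Dict[str, Any], fused: Dict[str, Tuple[Any, Any]]) -> None:
    """Undo the fused transaction with a single write per item."""
    for item, (before, _after) in fused.items():
        db[item] = before
```

If three transactions in a group each write the same item, undoing them one by one takes three
writes to that item, while undoing the fused transaction takes one.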

Furthermore, several data structures have been developed to store transaction
relationships in various formats. During damage assessment, any of these structures can be used
to detect the set of affected transactions without accessing the log. The efficiency of these
auxiliary structures has been tested through simulation. Four algorithms have been developed,
each of which stores dependency relationships among transactions in a unique list format. In the
post-intrusion-detection phase, the stored dependency list, which is extremely compact, is
accessed for damage assessment. As a result, damage assessment becomes much faster with the
developed model than with methods that use a traditional log containing huge amounts of data.
Consequently, the unaffected portion of the database can be made available to users quickly. A
simulation of the model was developed, which demonstrated the efficiency of the model. This
model can be applied to the dependency-based logging approach without any significant change.
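
One possible shape for such an auxiliary structure is sketched below in Python. The four list
formats developed in the project are not described in this summary, so the DependencyList class,
its reads-from bookkeeping, and its method names are assumptions chosen for illustration.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List, Set


class DependencyList:
    """Hypothetical compact structure, maintained as transactions commit, that
    records which transactions read values written by which other transactions."""

    def __init__(self) -> None:
        self._dependents: Dict[str, List[str]] = {}  # writer -> transactions that read from it
        self._last_writer: Dict[str, str] = {}       # item -> last transaction that wrote it

    def record(self, tid: str, reads: Iterable[str], writes: Iterable[str]) -> None:
        """Register a committed transaction with its read set and write set."""
        for item in reads:
            writer = self._last_writer.get(item)
            if writer is not None and writer != tid:
                readers = self._dependents.setdefault(writer, [])
                if tid not in readers:
                    readers.append(tid)
        for item in writes:
            self._last_writer[item] = tid

    def affected(self, malicious: Iterable[str]) -> Set[str]:
        """Return every transaction that depends, directly or transitively, on a
        malicious transaction, without scanning the log."""
        start = set(malicious)
        seen: Set[str] = set(start)
        queue: Deque[str] = deque(start)
        while queue:
            current = queue.popleft()
            for reader in self._dependents.get(current, []):
                if reader not in seen:
                    seen.add(reader)
                    queue.append(reader)
        return seen - start
```

The structure stores only transaction identifiers and item names, so it stays far smaller than a
log that also holds before-images and after-images of every write.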

A mechanism for classifying data into rigid and regular categories has been developed.
Under this model, a regular user can read data in both the rigid and regular categories but can
write only data in the regular category. Only a very small group of people, such as database
administrators, has write access to data in the rigid category. The most obvious reason for
developing this protection mechanism was to prevent intentional violation of an access
restriction by a user. It is essential to ensure that each active transaction uses data only in ways
consistent with the stated policy for use of the data. A user is allowed to exercise a right on a
data object only if (s)he holds that right on the object. To support this scheme, a protection
domain has been defined and a suitable access matrix has been developed. When a user executes
a transaction, the protocol performs some simple Boolean operations to determine whether to
accept or reject the transaction. This classification model helps contain the damage to a limited
area of the database in case of an attack, thus improving the efficiency of the damage
assessment and recovery process.
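
The acceptance check can be pictured with a small Python sketch that stores rights as bit masks.
The role names, the matrix layout, and the accept function are hypothetical; only the policy
itself (regular users read both categories and write regular data, administrators may also write
rigid data) comes from the description above.

```python
from typing import Dict, Iterable, Tuple

READ = 0b01
WRITE = 0b10

# Hypothetical access matrix: (role, data category) -> granted rights.
ACCESS_MATRIX: Dict[Tuple[str, str], int] = {
    ("regular_user", "regular"): READ | WRITE,
    ("regular_user", "rigid"): READ,
    ("administrator", "regular"): READ | WRITE,
    ("administrator", "rigid"): READ | WRITE,
}


def accept(role: str, requests: Iterable[Tuple[str, int]], category_of: Dict[str, str]) -> bool:
    """Accept a transaction only if every requested right on every item is granted.

    requests lists (item, needed rights) pairs; category_of maps each item to
    "rigid" or "regular". Unknown role and category pairs are granted nothing.
    """
    for item, needed in requests:
        granted = ACCESS_MATRIX.get((role, category_of[item]), 0)
        if needed & granted != needed:
            return False
    return True
```

Each check is one bitwise AND and one comparison per requested item, which keeps the overhead
added to transaction execution small.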

Traditional logs are inadequate for recovery from information attacks since they do not
contain pertinent information. A semantically rich logging protocol has been devised that
records all the information required for the complete repair of databases that have suffered
malicious attacks. Based on this log, a mechanism for recovery from system failures has been
developed. In addition, two methods are proposed for complete damage assessment and
recovery of an attacked database. The first method performs damage assessment and recovery
concurrently, while the second method performs these tasks sequentially. Both mechanisms
work efficiently by re-executing only the damaged parts of an affected transaction rather than
the entire transaction. The unaffected parts of transactions are carefully identified and skipped.
The first method requires that the entire system enter a quiescent state in which no new
transaction is processed until the recovery is complete. The second method creates lists of
affected data items and affected transaction parts in the damage assessment phase and releases
the unaffected part of the database for regular operations. This makes the system available to
users while the recovery process continues. All protocols and methods developed in this
research have been substantiated with appropriate algorithms.